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Abstract 



Identifying causal relations among simultaneously acquired signals is an important 
problem in multivariate time series analysis. For linear stochastic systems Granger 
proposed a simple procedure called the Granger causality to detect such relations. In 
this work we consider nonlinear extensions of Granger's idea and refer to the result 
as Extended Granger Causality. A simple approach implementing the Extended 
Granger Causality is presented and applied to multiple chaotic time series and other 
types of nonlinear signals. In addition, for situations with three or more time series 
we propose a conditional Extended Granger Causality measure that enables us to 
determine whether the causal relation between two signals is direct or mediated by 
another process. 
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1 Introduction 



Given the deluge of multi-channel data generated by experiments in both 
science and engineering, the role of multivariate time series analysis, espe- 
cially nonlinear time series processing, has become crucial in understanding 
the patterns of interaction among different elements of nonlinear systems. In 
particular, identifying causal relations among signals is important in fields 
ranging from physics to biology to economics. One approach to evaluating 
causal relations between two time series is to examine if the prediction of 
one series could be improved by incorporating information of the other. This 
was originally proposed by Wiener [1] and later formalized by Granger in the 
context of linear regression models of stochastic processes [2]. Specifically, if 
the variance of the prediction error of the second time series at the present 
time is reduced by inclusion of past measurements from the first time series in 
the linear regression model, then the first time series is said to have a causal 
influence on the second time series. The roles of the two time series can be 
reversed to address the question of causal influence in the opposite direction. 
From this definition it is clear that the flow of time plays a vital role in making 
direction related inference from time series data. 

Since Granger causality was formulated for linear models, its direct applica- 
tion to nonlinear systems may or may not be appropriate, depending on the 
specific problem. In some cases, the linear Granger causality is able to identify 
the correct patterns of interaction for multiple nonlinear time series, but in 
some other cases, as will be shown later in this paper, it fails to do so. We 
deal with this issue by extending Granger's idea to nonlinear problems. Our 
starting point is the standard delay embedding reconstruction of the phase 
space attractors. Clearly, a full description of a given attractor requires a 
nonlinear set of equations. But, locally, one can approximate the dynamics 
linearly. Applying Granger's causality idea to each local neighborhood and 
averaging the resulting statistical quantity over the entire attractor results in 
Extended Granger Causality Index (EGCI). We examine the effectiveness of 
this idea on numerically generated nonlinear time series with known patterns 
of interaction. 

Works related to the identification of interdependence in nonlinear systems 
have appeared in the literature [3,4,5,6,7]. Particularly relevant for the work 
in this paper are works based on delay coordinate embedding reconstruction 
of phase space. Along this direction a number of methods of detecting non- 
linear interdependence or coupling based on nonlinear prediction theory have 
appeared in the past few years [8,9,10,11,12,13,14,15,16,17]. The basic ideas 
in these papers are similar, all involving the use of the points in a neigh- 
borhood in the reconstructed space to predict future dynamical behavior. In 
[9,10,11,12,13,14,15,16], time indices of neighborhood points in the space X 
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reconstructed from one time series (x) are used to predict the dynamics in 
space Y reconstructed from the second time series y. If this prediction is good 
enough, then it implies a dependence from x to y. Similarly, the reverse depen- 
dence can be found. Different authors define different criteria to quantify the 
goodness of the prediction, but the common assumption that nearby points 
in one reconstructed space X map to nearby points in another reconstructed 
space Y is adopted. These methods tend to detect strong interactions such 
as synchronization, phase synchronization or generalized synchronization. In 
order to detect weak interactions, a modification [17] was made by presenting 
a mixed-state prediction method where a reconstruction of mixing two time 
series was employed. It is important to note that all these nonlinear prediction 
based methods employ the same kind of predictor (a zeroth order predictor) 
which takes the mean or weighted mean as the prediction value. Since points 
in a given neighborhood come both from the past and the future of the ref- 
erence point this kind of prediction does not account properly for the flow of 
time. Our idea differs from the previous methods in two main respects: (a) 
an linear regression predictor is employed for each local neighborhood and (b) 
as a consequence the flow of time is explicitly incorporated in the predictor 
which is an essential element of inferring causal relations in multiple time se- 
ries [2]. A nonlinear approach that shares a number of similarities with ours 
has appeared in [5]. 



2 Theory 

In this section we will first review the basic idea of Granger Causality for- 
mulated for analyzing linear systems and then propose a generalization of 
Granger's idea to attractors reconstructed with delay coordinates. 

Granger Causality: The method of detecting causal relations among multiple 
linear time series is based on linear prediction theory. For a stationary time se- 
ries x(t), consider the following AutoRegressive (AR) prediction of the current 
value of x(t) based on m past measurements: 



Here e x (t) is the prediction error whose magnitude can be evaluated by its vari- 
ance var{e x {t)). Suppose that simultaneously we have also acquired another 
stationary time series y(t). Consider the following prediction of the current 



m 



X (t) = a 3 X (t ~ j) + £ x{t)- 
3=1 




4 



value of x(t) based both on its own past values and the past values of y(t): 

m m 

x(t) = £ a jX (t - j) + £ b jy (t - j) + e xly (t). (2) 

3=1 3=1 

If the prediction improves by incorporating the past values of y(t), that is, 
var(e x \ y (t)) < var(e x (t)) in some suitable sense, then we say that y(t) has a 
causal influence on x(t). Similarly, we may consider 



y(t) = J2Pjy(t-j)+e y (t), (3) 

3=1 

m m 

y(t) = E c J x ( f - j) + E d jy(t - j) + £ v Ut). (4) 

3=1 3=1 

and say that x(t) has a causal influence on ^/(t) if var(£j,i x (i)) < var(s y (t)). We 
note that Eqs. (2) and (4) together form the following Vector AutoRegressive 
model (VAR): 



X (t) = ^jX{t - j) + E hvi 1 - j) + £ x\y(t), 

(5) 

mm V / 

= E - i) + E d if (* ~ i) + £ v\x(t), 

3=1 3=1 

where standard techniques exist to estimate such models from time series data. 

Extended Granger Causality: Consider two nonlinear time series x(t) and y(t). 
The joint dynamics is reconstructed with the following delay vector [18,19] 

z(*) = (x(*) T ,y(*) T ) T , (6) 

where x(t) = (x(t),x(t - n), • • • , x(t - {m l - l)ri)) r , y(t) = (y(t),y(t - 
T2), • • • , y(t — (m 2 — l)r 2 )) T , m, is embedding dimension and r, is time delay 
for i — 1,2. Usually, the embedding dimensions and time delays for different 
series can be different. However when we investigate Granger causality, the 
time delays must be equal so that causal inferences can be made. Hereafter 
we take T\ = r 2 = r. 

In the delay embedding space, there exists a function f that maps a given 
point z(t) to its observed image z(t + r). Usually, this function has no known 
analytical form but can be locally approximated by a linear map around some 
reference point [20,21]: z(t+r) = Az(t)+r(t), where A is (mi+m 2 )x(mi+m 2 ) 
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coefficient matrix which can be determined by the least-squares technique and 
r(t) is the error vector. Substituting Eq.(6) in the above relations, we get the 
following equations: 



x(t + r) 



\y(t + r) 





+ A 2 



x{t — (m — l)r 



(7) 



(0 „« 



where A,- = 



11 


fl 12 


21 


(0 
°22 



y(t - (m - l)r) 
, and e y \ x are the error terms, and we have 



assumed mi = m-i = m for simplicity. If mi 7^ m,2, some diagonal terms of A« 
would turn out to be zero. 



We note that Eq. (7) is just another form of Eq. (5) for non-unit time step. 
Therefore, in Eq. (7), £ x \ y (or e y \ x ) actually gives the prediction error of x (or 
y) after incorporating y (or x). To proceed further, we also reconstruct each 
series independently around the x and y parts of the same reference point 
using linear regression approximations to obtain 

mi 

x(t + r) = ®jx{t + (j - l)r] + e x , 

Zl (8) 

y{t + T) = Y J l3 3 y[t + {3-i)r]+e y . 

i=i 



We can now apply the ideas from Granger causality to these local linear sys- 

_ . , . , var(e x \ v ) . var(e v \ x ) 
terns. Thus, it the ratio of the errors (or t^V) is less than 1, 

var(e x ) var(e y ) 

it implies y (or x) has causal influence on x (or y). So far, this procedure 
only involves data in one local neighborhood around a given reference point. 
Clearly, for nonlinear systems the coefficient matrix in the linear approxima- 
tion is a function of the local neighborhood. We repeat the process above for 
a set of chosen neighborhoods scattered over the entire attractor and average 
the error ratios from all the neighborhoods to obtain the Extended Granger 
Causality. See below for a more formal formulation. 

The idea above is actually very similar to other ideas of detecting directional 
interdependence based on nonlinear mutual prediction [9,10,11,12,14,15,16,17]. 
The difference is that the previous work used the average or weighted average 
value of the images of the points in a given neighbor as the basis for predic- 
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tion. Thus they suffer from the lacuna that points in the neighborhood of the 
reference point have no explicit time relations to the reference point itself. 
On the other hand, we employ a linear model which preserves explicit time 
relations and are therefore able to derive unambiguous causality relationships. 

Summarizing, we propose the following four-step procedure to evaluate causal 
relations between two nonlinear time series: 

(1) Reconstruct the attractor using the delay coordinate embedding tech- 
nique [cf. Eq. (6)]. 

(2) Fit an autoregressive model for all the points in the neighborhood G of 
a reference point z in the reconstructed space R m i+ m 2 ; w h ere @ — | z ; 
|z — z | < 5}. 

(3) Perform the reconstruction and fitting process on the individual x and y 
time series in the same neighborhood and compute the error ratio defined 
earlier. Average the error ratio over a number of local neighborhoods in 
order to sample the full attractor adequately. Compute the Extended 

Granger Causality Index (EGCI) defined as A„_> x = (1 — -r^r ), 

var(e x ) 

where (•) stands for averaging over the neighborhood sampling the entire 
attractor. 

(4) Compute EGCI as a function of the neighborhood size 5. For linear sys- 
tems this index will stay roughly the same as 5 becomes smaller. For 
nonlinear systems this index, in the small 5 limit, reveals the true nonlin- 
ear causal relation which may or may not be captured at the full attractor 
level (i.e. taking 5 to be the size of the whole attractor). 

To reconstruct the attractor, the embedding dimension and time delay have 
to be determined. Usually the embedding dimension is determined by the 
false nearest neighbor technique [22] and the time delay is obtained as the 
first minimum of the mutual information function [23]. If the reconstructed 
attractor is a fixed point with some noise, then a criterion such as AIC [24] for 
linear stochastic processes can be used to determine the order of the model. A 
difficult issue for the present work is to determine the optimal neighborhood 
size 5. The number of points in the neighborhood should be large enough 
to establish good statistics. On the other hand, the size of the neighborhood 
should be small enough so that linearization is valid. In step (4) above we seek 
a compromise by examining the EGCI as a function of 5 in the attempt to 
get to the true nonlinear effect when 6 becomes small enough. We refer to this 
step as a zooming-in procedure. For sufficiently large dataset, the usual rule 
is that, the smaller the neighborhood size, the better the nonlinear prediction 
achieved [14,15,17]. 

Conditional Extended Granger Causality: The above analysis for two time se- 
ries can be extended to more than two time series by analyzing them pairwise. 
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However, pairwise analysis of more than two time series cannot detect indirect 
causal influences, an issue that has been addressed in linear time series anal- 
ysis [25]. For example, consider three time series A, B and C. Two possible 
causal relations among them are shown in Figures 1(a) and (b). In Figure 
1(a), the causal influence or driving from A to C is indirect and mediated by 
B. In Figure 1(b), both direct and indirect influences exist. Pairwise analysis 
would show an arrow from A to C and thus cannot separate these two cases. 
We propose the following procedure for the case of three time series which, 
as we show in the next section, is able to reveal the true patterns of causal 
interactions. 

Suppose SA(t), s B (t), sc(t) are the given three time series, we reconstruct vec- 
tor z in whole space as [19]: 

z(t) = (s A (t) T ,s B (t) T ,s c (t) T ) T , (9) 



where s A (t) = (s A (t), s A (t-r), ■■■ , s A (t-(mi-l)r)) T ', s B (t) = (s B (t), s B (t- 
r), • • • , s B (t-(m 2 -l)T)) T , s c (t) T = (s c (t),s c (t-T), • • • , s c (t-(m 3 -l)r)) T . 
Then the vector autoregression obtained from a local linear approximation is 
given by 



mi i7»2 

SA(t + r) = J2 4is A [t - (i - l)r] + ]T a$s B [t - (i - l)r] 
i=i i=i 

I7»3 



1=1 



+ a i3 s c[t ~ (i - 1)7"] + £a\bc, 



mi rri2 

+ t) = J2 a$s A [t - (i - l)r] + ]T a®s B [t - (i - l)r] 
i=i i=i 

m 3 

+ a 23 s c[t ~ (i ~ 1)t] + £b\ac, 

i=l 

mi m2 

sc{t + r) ='£a$s A [t - (i - l)r] +'£c$s B [t - (i - l)r] 

i=i i=i 

7713 



(10) 



(») 

1=1 



+ a 33 s c[t ~ (i - 1)t] + £c|Ai?- 



The term Ec\ab is the prediction error of the series s c after incorporating 
both and s B . If this prediction is no better than the prediction obtained 
by incorporating only the series s B , it means has no direct causal influence 
on sc- Therefore we can define a Conditional Extended Granger Causality 

Index (CEGCI) as A a ^ c \b = ( 1 - ^ , 1 , )■ This can be used to dis- 

var(e C | B ) 

tinguish between direct and indirect causal relations. Conditional causality 
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indices between other time series pairs can be similarly defined. 



It is worth mentioning that for more than three time series, if the causality 
between any two time series is indirect, then taking one additional series or 
more than one additional series in the causality chain as the conditional one(s) 
will not make the results any different. Therefore analysis of three time series 
is sufficient to detect the intrinsic causal influences in any multiple time series 
system. 



3 Numerical Simulations and Discussion 

In order to make the whole discussion concrete, we study some examples. The 
number of reference points around the attractor is 100 for all the examples. 

Example 1: Let's consider two time series generated from unidirectionally cou- 
pled 2D maps. Two different coupling schemes, one linear and one nonlinear, 
are examined. The system with linear coupling is written as 



x(n) = 3Ax(n - 1)(1 - x 2 (n - l))e~ x2{n ~ 1] + 0.8x(n - 2), 

y(n) = 3Ay{n - 1)(1 - y 2 {n - l))e^ 2(n " 1) + 0.5y{n - 2) + cx{n - 2), 



x(n) = 3Ax(n - 1)(1 - x 2 (n - l))e" a;2(n " 1) + 0.8x(n - 2), 

y(n) = 3Ay(n - 1)(1 - y 2 {n - l))e~ y2{n - l) + 0.5y(n - 2) + cx 2 {n - 2)." 



It is obvious that y is driven by x in both systems. In order to make the 
simulations realistic, some system noise and measurement noise are added to 
the time series. The attractor reconstructed from the x time series is given in 
Figure 2(a). Figures 2(b) and 2(c) give the reconstructed attractors from the 
y time series driven linearly and nonlinearly by x with the coupling strength 



Both these cases are analyzed using our procedure in the previous section with 
m = 2 and r = 1. We obtain the EGCI as a function of the neighborhood size 
6 in Figure 3. For both linear driving [Fig. 3(a)] and nonlinear driving [Fig. 
3(b)], A y ^, x (shown as the thicker curve) is seen to be zero. Thus x is not 
influenced by y as expected from the construction of our model. In Fig. 3(a), 
A x -> y is non-zero starting from large neighborhood sizes 5 and increases as 5 
decreases. This implies that x has a causal influence on or drives y. Further, 




and the system with nonlinear coupling is 



c = 0.5. 
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since A x ^ 9 is non-zero even for large 6 values, this means that even a linear 
causality analysis would have detected the correct causal relations in this case. 
On the other hand, in Figure 3(b) (for nonlinear driving), A x ^ y becomes non- 
zero only when 5 is sufficiently small. In this linear causality analysis 
would fail to detect the correct pattern of driving. This example illustrates 
the importance of nonlinear causality analysis in such cases. 

Example 2: We consider two time series generated by two coupled two-dimensional 
ODEs where the fixed point in the origin is stable: 

x\ = — 0.25zi + x 2 — xl, 

X2 — X\ — X2 — X\X 27 

y\ = -0.25yi + y 2 - y\ + cx\, 
2/2 = 2/i -2/2-2/12/2- 

Adding some system noise and measurement noise and taking x\ and y\ as 
the acquired signals, we get two modified time series x and y. Clearly in this 
case x series drives y series. Reconstructing the attractors from these two 
time series with r = 2, finding the neighborhood of every reference point 
and fitting a second order AR model (m = 2) in every neighborhood, we 
obtain the Extended Granger Causality Index (EGCI) as a function of the 
neighborhood size 5 for different coupling strengths as shown in Figure 4(a). 
We make three observations. First, A y -> x 0, whereas A x -> y is non-zero, 
clearly demonstrating that the x series drives the y series but not vice-versa. 
Second, the level of EGCI is proportional to the coupling strength. Third, 
since the processes here are linear, the EGCI is not a function of 5. 

Example 3: Next we look at an ODE system exhibiting chaotic behaviors. The 
following two coupled Rossler oscillators are considered: 

xi = -(2/1 + 24), 
y\ = x 1 + 0.2t/i, 
z 1 = 0.2 + z 1 (x 1 -4.7) 
^2 = -(2/2 + z 2 ) +cx 1 , 
y 2 = x 2 + 0.2y 2 , 
i 2 = 0.2 + z 2 (x 2 -4.7) 

As done earlier, some system noise and measurement noise are added to the 
two time series x\ and x 2 to obtain x and y time series. Reconstructing the 
attractors with m = 3 and r = 2 and fitting linear models in every local 
neighborhood, we obtain EGCI shown in Figure 4(b). It is seen that x has 
a causal influence on y as expected. Besides, A x ^ y is non-zero even for large 



(13) 



(14) 
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neighborhood sizes. Thus, a linear causality analysis would detect the correct 
causal relations in this strongly nonlinear system. 

Example 4-' To show how to distinguish the pattern of interaction shown in 
Figure 1(b) from that shown in Figure 1(a), let us consider three time series. 
Both linear systems and nonlinear systems are considered. 

For a linear stochastic system the following three coupled AR(1) models are 
considered: 



x(n) = 0.2x(n — 1) + e±, 

y (n) = 0.5y(n - 1) + 0.5x(n - 1) + e 2 , (15) 
z(n) = 0Az(n - 1) + 0.3y(n - 1) + cx(n - 1) + e 3 . 



For chaotic time series, the following three coupled 1-d maps are considered: 
x(n) = 3Ax(n - 1)(1 - x 2 (n - l))e~ x ^ n - 1] , 

y{n) = 3.4y(n - 1)(1 - y 2 {n - l))e' y2(n -^ + 0.5x(n - 1), (16) 
z(n) = 3Az(n - 1)(1 - z 2 (n - l))e" 22(n " 1) + 0.3y(n - 1) + cx(n - 1). 

Here x, y and z correspond to A, B and C in Figure 1. In addition, c = 
simulates the indirect causality case (Fig. 1(a)) and c = 0.5 simulates the 
direct causality case (Fig. 1(b)). 

Numerical results for the linear case are shown in Figure 5 and results for the 
chaotic time series are shown in Figure 6. Figures 5(a) and 6(a) display results 
obtained using pairwise analysis. Similar plots are obtained for both c = and 
c = 0.5. As we can see, based on just pairwise analysis, one would conclude 
that Figure 1(b) is the pattern of interaction for both c = and c = 0.5. 
That is, the direct and indirect causal relationships cannot be separated by 
pairwise analysis alone. Figures 5(b) and 6(b) give the results obtained by 
simultaneously analyzing all three time series using conditional EGCI. In this 
case, for c = we obtain A x ^ z \ y fa (solid curve) indicating that no direct 
causal relation exists between x and z. Thus the correct causality graph [Figure 
1(a)] is obtained. On the other hand, for c = 0.5, A x ^ z \ y is non-zero (dotted 
curve) indicating direct causality between x and z and the causality graph 
shown in Figure 1(b) is obtained. 
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4 Conclusions 



We have extended the Granger causality theory to nonlinear time series by 
incorporating the embedding reconstruction technique for multivariate time 
series. A four-step algorithm was presented and used to analyze various linear 
and nonlinear coupled systems. The following conclusions were found: 

(1) Linear Granger causality analysis may or may not work for nonlinear time 
series. On the other hand, our method of applying the Extended Granger 
Causality Index to nonlinear time series always gives reliable results. 

(2) When three or more time series have to be analyzed, the Conditional 
Extended Granger Causality Index proposed here can distinguish between 
direct and indirect causal relationships between any two of the time series. 
This is not possible using simple pairwise analysis. 

As with other methods for analyzing nonlinear time series, the amount of data 
required for reliable analysis can be large. Possible improvements along this 
direction are being studied. 
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Figure Caption 



Figure 1: Two patterns of causal interactions, (a) A drives C by way of B 

and (b) There is a direct pathway from A to C. 
Figure 2: Reconstructed attractors from time series from Example 1. (a) 

Driving attractor; (b) linearly driven attractor; (c) nonlinearly driven at- 

tractor. 

Figure 3: Extended Granger Causality Index (EGCI) as a function of the 

size 5 of the neighborhood from Example 1. (a) Linear driving case; (b) 

nonlinear driving case. 
Figure 4: EGCI between two time series from Examples 2 and 3. (a) ODEs 

with a stable fixed point and different coupling strengths; (b) ODEs with 

chaotic behavior. 

Figure 5: Simulating two patterns of interactions in Figure 1 with three cou- 
pled AR models (Example 4). (a) Pairwise analysis results; (b) Conditional 
causality analysis separates the two cases. Dotted line gives the Conditional 
Extended Granger Causality Index (CEGCI) for c = 0.5 and the solid line 
for c = 0. 

Figure 6: Simulating two patterns of interactions in Figure 1 with three 
coupled chaotic Id- maps (Example 4). (a) Pairwise analysis results, the two 
patterns of interaction are not distinguished; (b) CEGCI analysis is able to 
distinguish between the two different patterns. Dotted line gives the CEGCI 
for c = 0.5 and the solid line for c = 0. 
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